2024 Voter Turnout: Part 2 – Sivuyile Nzimeni

1 INTRODUCTION

In the previous voter turnout post (https://gqu3.github.io/Personal_Site_3/posts/2024%20Election%20Results/), we provided a vignette on how to visualise voter turnout from the 2024 South African National and Provincial Elections (NPE). The election results marked a seismic shift in the political landscape, introducing coalition government at national and provincial levels. Coalitions were first introduced at scale in the 2016 local government elections (Joubert 2016).

Interestingly, both elections had a "Jacob Zuma effect": first in 2016, through his historically low approval ratings, and again in 2024, through the formation of a new political entity, the uMkhonto weSizwe Party¹. An often under-reported story of the election is voter turnout. This post focuses on the interaction between the 2022 Census results and the 2024 National and Provincial Election results. The goal is to determine whether there are predictors in the 2022 Census that can explain the regional differences in voter turnout.

2 DATA COLLECTION

We rely on two primary governmental sources: Statistics South Africa (StatsSA) and the Independent Electoral Commission (IEC). The Census dataset is available from the StatsSA website (https://census.statssa.gov.za/#/). The post-enumeration process and the publication of results have not been without controversy. Moultrie and Dorrington (2024) found flaws in the census results with policy and planning implications, including an overestimation of the overall population, demographic inaccuracies and inconsistent sub-national spatial estimates, among other issues.

Beyond these reported issues, the 2022 Census 10% Sample (https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/982) omits some crucial variables, such as income, water interruptions longer than two consecutive days, labour outcomes, and mortality and fertility. Since the publication of the report, there has been a flurry of activity among academics, the media and policy planners to mitigate these issues.

StatsSA has also retracted the "SA at a Glance: Census 2022" report, lending credence to the concerns raised about the Census 2022 data. The ramifications of overestimating or underestimating populations are far-reaching for both business and society. For our purposes, we need to treat the data as estimates rather than as a census until the adjusted data is published.

Despite an unsuccessful Electoral Court case alleging irregularities (see Zondi and Steyn (2024)), the IEC data is free of any material data collection controversy. Spatially joining the datasets presents the first challenge. The electoral results are reported by voting district, a sub-ward area designed to enable efficient voting, while the 2022 Census 10% Sample has the municipality as its lowest level of observation.

To enable a join, it is possible to aggregate the voter turnout data to municipal level and compare outcomes at the same level. However, this decision comes at the cost of nuance. South Africa is a spatially fragmented nation: intra-municipal socio-economic outcomes can vary greatly, even between contiguous neighbours. Voting behaviour can follow a similar pattern.

One approach to overcome this hurdle is dasymetric mapping, where we effectively distribute a socio-economic outcome across the estimated population distribution in an area. For our purposes, it suffices to aggregate the voting data to the same level as the census sample dataset.
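To make the idea of distributing an outcome across an estimated population concrete, here is a minimal base R sketch. It is a simplified, population-weighted allocation rather than a full dasymetric workflow, which would normally draw on ancillary spatial data. Every name and number in it (`est_pop`, `households_no_tap`, the toy municipalities) is a made-up assumption and does not come from the census or IEC files.

```r
# Toy data: hypothetical wards with an ancillary population estimate
wards <- data.frame(
  municipality = c("A", "A", "A", "B", "B"),
  ward         = c("A1", "A2", "A3", "B1", "B2"),
  est_pop      = c(5000, 3000, 2000, 4000, 6000)
)

# Toy data: a hypothetical municipal total to be spread across wards
census <- data.frame(
  municipality      = c("A", "B"),
  households_no_tap = c(1200, 500)
)

# Share of each municipality's estimated population living in each ward
wards$pop_share <- wards$est_pop /
  ave(wards$est_pop, wards$municipality, FUN = sum)

# Attach the municipal total and allocate it in proportion to that share
wards <- merge(wards, census, by = "municipality")
wards$households_no_tap_ward <- wards$households_no_tap * wards$pop_share

wards[, c("municipality", "ward", "pop_share", "households_no_tap_ward")]
```

The ward allocations add back up to the municipal totals by construction, so the method redistributes the estimate without creating new information. Its quality depends entirely on how good the ward-level population estimate is.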

2.1 DATA PREPROCESSING

Since the data collection methods differ across the two datasets, it is important to consider the special nature of the Census data: it is effectively a complex survey. Zimmer (2024) will serve as an important starting point for aggregating and estimating socio-economic differences across municipalities.
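As a rough illustration of why the sample should be treated as a survey, the sketch below compares a weighted and an unweighted proportion per municipality in base R. The column names (`has_internet`, `weight`) and all values are hypothetical stand-ins, not the variable names in the released 10% Sample. For design-based standard errors, the survey and srvyr packages discussed in Zimmer (2024) would be the more appropriate tools.

```r
# Hypothetical household records; `weight` stands in for a sampling weight
households <- data.frame(
  municipality = c("A", "A", "A", "B", "B"),
  has_internet = c(1, 0, 1, 0, 1),
  weight       = c(10, 12, 8, 9, 11)
)

# Weighted proportion per municipality: sum(w * x) / sum(w)
by_muni <- split(households, households$municipality)
weighted_share <- sapply(
  by_muni,
  function(d) weighted.mean(d$has_internet, d$weight)
)

# Unweighted proportion, for comparison
unweighted_share <- tapply(
  households$has_internet, households$municipality, mean
)

rbind(weighted = weighted_share, unweighted = unweighted_share)
```

Even in this toy case the two rows differ, which is the practical reason for carrying the weights through every municipal aggregate.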


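# Attach the packages used in this post. require() returns FALSE instead of
# raising an error when a package is missing, so install them beforehand.
lapply(c("tidyverse","janitor","sf","spdep",
         "sfdep","ggthemes","fishualize","showtext",
         "tmap","haven","psych"),
       require,
       character.only = TRUE) |> 
  suppressWarnings() |> 
  suppressMessages()

theme_set(theme_minimal())  # default ggplot2 theme for all plots
sf_use_s2(FALSE)            # switch off s2 spherical geometry; sf treats coordinates as planar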


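# Ward-level turnout differences: keep the identifying columns,
# the turnout measure and the geometry.
Ward_Turnout <- readRDS(file = "data/SA_Wards_Turn_Out_Difference_1_2024-06-17.rds")[
  , c("province","cat_b","municipali","ward_no","district",
      "district_co","ward","turnout_diff","geometry")]

# 2022 Census 10% Sample: household records and the geography file.
Household_Data <- read_dta(file = 'data/sa-census-2022-v1-stata/sa-census-2022-household-v1.dta')

Geodataset <- read_dta("data/sa-census-2022-v1-stata/sa-census-2022-geography-v1.dta")

# Aggregate wards to municipalities using the median turnout difference.
# Note: wards with a missing turnout_diff are counted as 0 here.
Municipal_Turnout <- Ward_Turnout |> 
  mutate(turnout_diff = ifelse(is.na(turnout_diff),0,turnout_diff)) |> 
  group_by(province,cat_b,municipali) |> 
  summarise(turnout_diff = median(turnout_diff),
            .groups="drop") |> 
  sf::st_drop_geometry()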

although coordinates are longitude/latitude, st_union assumes that they are
planar
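Two practical points follow from the aggregation step: a median hides within-municipality spread, and a join on municipality names can silently drop rows when the names differ. The base R sketch below illustrates both on toy data. It is not the join used in this post. The census column `internet_share` and all values are hypothetical, and it drops missing wards instead of setting them to zero, so its medians can differ from those produced by the code above.

```r
# Hypothetical ward-level turnout differences (percentage points)
ward_turnout <- data.frame(
  municipali   = c("A", "A", "A", "B", "B", "B"),
  turnout_diff = c(-2, -8, 5, -1, -1.5, NA)
)

# Median and interquartile range per municipality, ignoring missing wards
med <- tapply(ward_turnout$turnout_diff, ward_turnout$municipali,
              median, na.rm = TRUE)
iqr <- tapply(ward_turnout$turnout_diff, ward_turnout$municipali,
              IQR, na.rm = TRUE)
municipal <- data.frame(
  municipali  = names(med),
  median_diff = as.vector(med),
  iqr_diff    = as.vector(iqr)
)

# Hypothetical municipal census estimates; "B" is deliberately absent
census_est <- data.frame(
  municipali     = c("A", "C"),
  internet_share = c(0.60, 0.40)
)

# Left join, then list municipalities that found no census match
joined <- merge(municipal, census_est, by = "municipali", all.x = TRUE)
unmatched <- as.character(joined$municipali[is.na(joined$internet_share)])
unmatched
```

Keeping a spread measure next to the median shows how much nuance the municipal aggregate is hiding, and checking `unmatched` before modelling guards against losing municipalities to spelling differences between the two sources.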

References

Joubert, Jan-Jan. 2016. ‘Stronger Maimane and DA Plan ANC Downfall’. Sunday Times, December, 4. https://discover-sabinet-co-za.ufs.idm.oclc.org/document/10431380.

Moultrie, Tom, and Rob Dorrington. 2024. The 2022 South African Census: A Technical Report Prepared for the South African Medical Research Council. Centre for Actuarial Research: University of Cape Town. https://www.samrc.ac.za/sites/default/files/attachments/2024-07/CensusReport.pdf.

Njilo, Nonkuleko. 2024. ‘Explainer - What We Know about Jacob Zuma’s New Party’. Daily Maverick, January. https://dailymaverick.co.za/article/2024-01-09-umkhonto-wesizwe-what-we-know-about-zumas-new-party/.

Zimmer, Stephanie. 2024. Exploring Complex Survey Data Analysis Using R: A Tidy Introduction with srvyr and survey. 1st ed. Milton: CRC Press LLC.

Zondi, DP, and AJJ Steyn. 2024. Umkhonto Wesizwe Political Party v the Electoral Commission of South Africa and Others. 0034/24EC. https://www.saflii.org/za/cases/ZAEC/2024/26.htm.

Footnotes

¹ See Njilo (2024).

Published: December 11, 2024. Categories: spatial data, data cleaning, visualisation.